          PATTERN-MATCHING RULES FOR THE RECOGNITION OF
             NATURAL LANGUAGE DIALOGUE EXPRESSIONS

                     Kenneth Mark Colby
                             and
                     Roger C. Parkison


INTRODUCTION

     To recognize something is to identify it as an instance of the
"same again".  This familiarity is possible because of recurrent
characteristics of the world.  We shall describe an algorithm which
recognizes recurrent characteristics of natural language dialogue
expressions.  It utilizes a multi-stage sequence of pattern-matching
rules for progressively transforming these input expressions into a
pattern which eventually best matches a more abstract stored pattern.
The name of the stored pattern has a pointer to the name of a
response function in memory which decides what to do once the input
has been recognized.  Here we shall discuss only the recognizing
functions, except for one response function (anaphoric substitution)
which interactively aids the recognition process.  Details of how the
response functions operate will be described in a future
communication by William Faught.
     In constructing and testing a simulation of paranoid processes,
we were faced with the problem of reproducing paranoid linguistic
behavior in a teletyped diagnostic psychiatric interview.  The
diagnosis of paranoid states, reactions, or modes is made by
clinicians who judge the degree of correspondence between what they
observe linguistically in an interview and their conceptual model of
paranoid behavior.  There is a high degree of agreement among
psychiatrists about this conceptual model, which relies mainly on
what an interviewee says and how he says it.
     Natural language is a life-expressing code people use for
communication with themselves and others.  In a real-life dialogue
such as a psychiatric interview, the participants have interests,
intentions, and expectations which are revealed in their linguistic
expressions.  To produce effects on an interviewer which he would
judge similar to the effects produced by a paranoid patient, an
interactive simulation of a paranoid patient must be able to
demonstrate typical paranoid interview behavior.  To achieve the
desired effects, the paranoid model must have the ability to deal
with the teletyped linguistic behavior of the interviewer.
     There are a number of approaches one might consider for an
ideal handling of dialogue expressions.  One approach would be to
construct a dictionary of all expressions which could possibly arise
in an interview.  Associated with each expression would be its
interpretation, depending on dialogue context, which in turn must
somehow be defined.  For obvious reasons, no one takes this approach
seriously.  Instead of an expression dictionary, one might construct
a word dictionary and then use projection rules to yield an
interpretation of a sentence from the dictionary definitions.  This
has been the approach adopted by most linguistic parsers based on
transformational or systemic grammars.  Such a method performs
adequately as long as the dictionary involves only a few hundred
words, each word having only one or two senses, and the dialogue is
limited to a mini-world of a few objects and relations.  But the
problems which arise in a real-time psychiatric interview conducted
in unrestricted English are too great for this method to be useful
in actual dialogues requiring immediate responses.
     There is little consensus about how humans process natural
language.  They can be shown to possess some knowledge of grammar
rules, but this does not entail that they use a grammar in
interpreting and producing language.  It seems implausible to us
that people possess full transformational grammars for processing
language.  Language is what is recognized, but the processes
involved may not be linguistic or grammatical.  Transformational
grammars were not originally designed to "understand" a large subset
of English; they represented a formal method for deciding whether a
string is grammatical.
     An analysis of what one's problem actually is should guide the
selection or invention of methods appropriate to that problem's
solution.  Our problem was not to develop a consistent and general
theory of language, nor to assert empirically testable hypotheses
about how people process language.  Our problem was to design an
algorithm which recognizes what is being said in a dialogue and what
is being said about it, in order to make a response such that a
sample of I-O pairs from the paranoid model is judged similar to a
sample of I-O pairs from paranoid patients.  The design task was one
for artificial intelligence, which attempts to get computers to
perform human intellectual functions.  New methods had to be devised
for an algorithm to participate in a human dialogue in a
paranoid-patient-like way.  We are not claiming that our methods
represent the way people process language.  We sought effective
methods which could operate efficiently in real time.  Since our
methods provide a general way of mapping many surface expressions to
a single stored pattern, they can be used by any type of "host"
system which takes natural language as input.
     To perform the task of managing the communicative uses and
effects of natural language, we developed a strategy of transforming
the input until a pattern is achieved which matches, completely or
partially, a more abstract stored pattern.  This strategy has proved
adequate for our purposes a satisfactory percentage of the time.
(No one expects any method to be successful 100% of the time, since
not even humans, the best natural language processors around,
achieve this level of performance.)  The power of this method for
natural language dialogues lies in its ability to ignore as
irrelevant some of what it recognizes and everything it does not
recognize at all.  A conventional parser doing word-by-word,
parts-of-speech analysis fails when it cannot find one or more of
the input words in its dictionary.  Because it must know every word,
it is too fragile for unrestricted dialogues.
     In early versions of the paranoid model (PARRY1), many (but not
all) of the pattern-recognition mechanisms allowed the elements of
the pattern to be order-independent (Colby, Weber, and Hilf, 1971).
For example, consider the following expressions:
     (1) WHERE DO YOU WORK?
     (2) WHAT SORT OF WORK DO YOU DO?
     (3) WHAT IS YOUR OCCUPATION?
     (4) WHAT DO YOU DO FOR A LIVING?
     (5) WHERE ARE YOU EMPLOYED?
In PARRY1 a procedure would scan these expressions looking for an
information-bearing contentive such as "work", "for a living", etc.
If it found such a contentive along with a "you" or "your" in the
expression, regardless of word order, it would respond to the
expression as if it were a question about the nature of one's work.
This method correctly classifies the five sentences above.
Unfortunately, it places the two examples below in the same
category:
     (6) DOES YOUR FATHER'S CAR WORK?
     (7) HOW DID THINGS WORK OUT FOR YOU?
An insensitivity to word order has the advantage that lexical items
representing different parts of speech can represent the same
concept, e.g. "work" as noun or as verb.  But a price is paid for
this resilience and elasticity.  We found from experience that,
since English relies heavily on word order to convey the meaning of
its messages, the average penalty of misunderstanding (to be
distinguished from ununderstanding) was too great.  Hence in PARRY2,
as will be described shortly, the patterns require a specified word
order.
     For high-complexity problems it is necessary to have
constraints.  Diagnostic psychiatric interviews (and especially
those conducted over teletypes) have several natural constraints.
First, clinicians are trained to ask certain questions in certain
ways.  This limits the number of patterns required to recognize
utterances about each topic.  Second, only a few hundred standard
topics are brought up by interviewers, who are trained to use
everyday expressions and especially those used by the patient
himself.  When the interview is conducted over teletypes,
expressions tend to be shortened, since the interviewer tries to
increase the information transmission rate over the slow channel of
a teletype.  Finally, teletyped interviews represent written
utterances.  Utterances are known to be highly redundant, so
unrecognized words can be ignored without losing the meaning of the
message.  Utterances are also loaded with idioms, cliches, pat
phrases, etc. - all easy prey for a pattern-matching approach.  It
is time-wasting and usually futile to try to decode an idiom by
analyzing the meanings of its individual words.
     We shall now describe the pattern-matching functions of the
algorithm in some detail.  (See Fig. 1 for a diagram of the overall
flow of control.)


OVERVIEW

     PARRY2 has two primary modules.  The first attempts to
RECOGNIZE the input and the second RESPONDS.  This paper is
primarily about the RECOGNIZE module.  It functions independently of
the RESPOND module except in the case of pronoun references, which
the RESPOND module provides to the RECOGNIZER on request.
     The recognition module has four main steps:
     1) Identify the words in the input and convert them to
        internal synonyms.
     2) Break the input into segments at certain bracketing words.
     3) Match each segment (independently) to a stored pattern.
     4) Match the resulting list of recognized segments to a stored
        compound pattern.
Each of these steps, except the segmenting, throws away what it
cannot identify.  When an utterance refers to a topic which PARRY2
has some ability to recognize, this increases the chance that it
will ultimately be recognized.  Occasionally, a reference to an
unknown topic is mis-recognized as some familiar topic.
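     As an illustration only, the following minimal sketch traces
these four steps in modern Python notation.  The dictionary,
bracketing list, and pattern tables here are toy, hypothetical
stand-ins for PARRY2's actual data base, and the function names are
ours, not the model's:

    # Toy stand-ins for the real 1900-entry dictionary, the
    # bracketing list, and the stored simple and compound patterns.
    DICT = {"WHAT": "WHAT", "IS": "BE", "ARE": "BE", "YOUR": "YOU",
            "YOU": "YOU", "OCCUPATION": "JOB", "WHY": "WHY",
            "IN": "IN", "HOSPITAL": "HOSPITAL"}
    BRACKET = {"IN", "WHY", "WHAT"}
    SIMPLE = {("WHAT", "BE", "YOU", "JOB"): "ASK-JOB",
              ("WHY", "BE", "YOU"): "ASK-WHY",
              ("IN", "HOSPITAL"): "IN-HOSPITAL"}
    COMPOUND = {("ASK-WHY", "IN-HOSPITAL"): "WHY-IN-HOSPITAL"}

    def recognize(text):
        # 1) translate words to internal synonyms, dropping unknown words
        words = [DICT[w] for w in text.upper().rstrip("?.!").split()
                 if w in DICT]
        # 2) break the word list into segments at bracketing words
        segments, current = [], []
        for w in words:
            if w in BRACKET and current:
                segments.append(tuple(current))
                current = []
            current.append(w)
        if current:
            segments.append(tuple(current))
        # 3) match each segment independently to a stored simple pattern
        names = tuple(SIMPLE[s] for s in segments if s in SIMPLE)
        # 4) match the recognized segment names to a compound pattern
        if names in COMPOUND:
            return COMPOUND[names]
        return names[0] if len(names) == 1 else None

    print(recognize("WHAT IS YOUR OCCUPATION?"))  # prints ASK-JOB
    print(recognize("WHY ARE YOU IN HOSPITAL?"))  # prints WHY-IN-HOSPITAL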

PREPROCESSING

     Each word in the input expression is first looked up in a
dictionary of (currently) about 1900 entries which, for the sake of
speed, is maintained in core during run-time.  (The complete
language recognition process requires less than one second of real
time on a time-shared DEC PDP-10.)  The dictionary, which was built
empirically from thousands of teletyped interviews with previous
versions of the model, consists of words, groups of words, and the
names of the word-classes they can be translated into.  (See Fig. 2
for illustrative examples of dictionary entries.)  The size of the
dictionary is determined by the patterns, i.e. a word is not
included unless it appears in one or more patterns.  Entries in the
dictionary thus reflect PARRY2's main interests.  If a word in the
input is not in the dictionary, it is dropped from the pattern being
formed.  Thus if the input were:
     WHAT IS YOUR CURRENT OCCUPATION?
and the word "current" were not in the dictionary, the pattern at
this stage would become:
     ( WHAT IS YOUR OCCUPATION )
The question mark is thrown away as redundant, since questions are
recognized by word order.  (A statement followed by a question mark
(YOU GAMBLE?) is responded to in the same way as that statement
followed by a period.)  Synonymic translations of words are then
made so that the pattern becomes, for example:
     ( WHAT BE YOU JOB )
     Some groups of words (i.e. idioms) are translated as a group so
that, for example, "for a living" becomes "for job".  Certain other
juxtaposed words are contracted into a single word, e.g. "place of
birth" becomes "birthplace".  This is done to deal with groups of
words which are represented as a single element in the stored
pattern, thereby preventing segmentation from occurring at the wrong
places, such as at a preposition inside an idiom or phrase.  Besides
these contractions, certain expansions are made so that, for
example, "DON'T" becomes "DO NOT" and "I'D" becomes "I WOULD".
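     A sketch of this group-translation pass follows, under the same
illustrative assumptions (the table entries are hypothetical samples
of the real dictionary's groups):

    # Contraction expansion and group (idiom) translation; multi-word
    # groups are replaced before segmentation so they cannot be split.
    EXPANSIONS = {"DON'T": ["DO", "NOT"], "I'D": ["I", "WOULD"]}
    GROUPS = {("FOR", "A", "LIVING"): ["FOR", "JOB"],
              ("PLACE", "OF", "BIRTH"): ["BIRTHPLACE"]}

    def translate_groups(words):
        expanded = []
        for w in words:                      # expand contractions first
            expanded.extend(EXPANSIONS.get(w, [w]))
        result, i = [], 0
        while i < len(expanded):             # then translate word groups,
            for n in (3, 2):                 # longest match first
                group = tuple(expanded[i:i + n])
                if group in GROUPS:
                    result.extend(GROUPS[group])
                    i += n
                    break
            else:
                result.append(expanded[i])
                i += 1
        return result

    print(translate_groups("WHAT DO YOU DO FOR A LIVING".split()))
    # prints ['WHAT', 'DO', 'YOU', 'DO', 'FOR', 'JOB']
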
     Misspellings can be the bane of teletyped interviews for an
algorithm.  Here they are handled in two ways.  First, common
misspellings of important words are simply put in the dictionary.
Thus "yoy" is known to mean "you".  The apostrophe is often omitted
from contractions, so most contractions are recognized with or
without it.  These common misspellings were gathered from over 4000
interviews with earlier versions of the paranoid model.
     Second, there are five common forms of typing error which are
checked systematically.  These are:
     1) Doubled letter
     2) Extraneous letter
     3) Forgetting to hold the "shift key" for an apostrophe
     4) Hitting a nearby key on the keyboard
     5) Transposing two letters in a word
The first three errors can be corrected by deleting the offending
character from the word.  This is accomplished by deleting each
character in turn until the word becomes a recognized word.  The
fourth type of error is checked for only 10 of the more common
near-misses.  These were also empirically determined and include
letter pairs like (T Y), (O P), (A S), and (N M).  These methods are
all based on typing errors, but they also correct some legitimate
English spelling errors.  Two-letter transposition corrects, for
example, "beleive" to "believe".
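     The following sketch shows how these checks might be applied in
order.  WORDS stands in for the dictionary; the (U Y) pair and the
"7" produced by a missed apostrophe shift are assumptions added here
so the examples run:

    # Systematic correction of the five typing-error forms.  The first
    # three (doubled letter, extraneous letter, missed shift key) are
    # all handled by deleting one character at a time.
    WORDS = {"YOU", "BELIEVE", "DONT"}        # stand-in dictionary
    NEARBY = {"T": ["Y"], "Y": ["T", "U"], "O": ["P"], "P": ["O"],
              "A": ["S"], "S": ["A"], "N": ["M"], "M": ["N"],
              "U": ["Y"]}                     # (U Y) assumed here

    def correct(word):
        if word in WORDS:
            return word
        for i in range(len(word)):            # delete each character in turn
            candidate = word[:i] + word[i + 1:]
            if candidate in WORDS:
                return candidate
        for i, c in enumerate(word):          # try nearby-key substitutions
            for r in NEARBY.get(c, []):
                candidate = word[:i] + r + word[i + 1:]
                if candidate in WORDS:
                    return candidate
        for i in range(len(word) - 1):        # transpose adjacent letters
            candidate = word[:i] + word[i + 1] + word[i] + word[i + 2:]
            if candidate in WORDS:
                return candidate
        return word                           # unrecognized: leave it alone

    print(correct("YOUU"))     # doubled letter     -> YOU
    print(correct("DON7T"))    # missed shift key   -> DONT
    print(correct("BELEIVE"))  # transposed letters -> BELIEVE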

SEGMENTING

     Another weakness in the crude pattern matching of PARRY1 was
that it took the entire input expression as its basic processing
unit.  Hence if only two words were recognized in an eight-word
utterance, the risk of misunderstanding was great.  We needed a way
of dealing with units shorter than the entire input expression.
     Expert telegraphists stay six to twelve words behind a received
message before transcribing it.  Translators wait until they have
heard four to six words before they begin translating.  Aided by a
heuristic from work in machine translation (Wilks, 1973), we devised
a way of bracketing the pattern constructed up to this point into
shorter segments, using prepositions, wh-forms, certain verbs, etc.
as bracketing points.  (A list of the bracketing terms appears in
Fig. 3.)  These tend to separate prepositional phrases and embedded
clauses from the main clause.  The new pattern formed is termed
either "simple", having no delimiters within it, or "compound",
i.e., being made up of two or more simple patterns.  A simple
pattern might be:
     ( WHAT BE YOU JOB )
whereas a compound pattern would be:
     (( WHY BE YOU ) ( IN HOSPITAL ))
Our experience with this method of segmentation shows that compound
patterns from teletyped psychiatric dialogues rarely consist of more
than three or four segments.
     After certain verbs (see Fig. 4) a bracketing occurs to replace
the commonly omitted "THAT", such that:
     ( I THINK YOU BE AFRAID )
becomes
     (( I THINK ) ( YOU BE AFRAID ))
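     In sketch form, this refines step 2 of the earlier outline
(BRACKET and THAT_VERBS are small hypothetical stand-ins for the
lists of Figs. 3 and 4):

    # Segmentation: split before delimiters such as prepositions and
    # wh-forms, and after verbs which commonly drop a following "THAT".
    BRACKET = {"IN", "ON", "TO", "WHY", "WHAT", "HOW", "WHEN"}
    THAT_VERBS = {"THINK", "BELIEVE", "KNOW"}

    def segment(words):
        segments, current = [], []
        for w in words:
            if w in BRACKET and current:      # bracket BEFORE a delimiter
                segments.append(current)
                current = []
            current.append(w)
            if w in THAT_VERBS:               # bracket AFTER such verbs,
                segments.append(current)      # standing in for the
                current = []                  # omitted "THAT"
        if current:
            segments.append(current)
        return segments

    print(segment("WHY BE YOU IN HOSPITAL".split()))
    # prints [['WHY', 'BE', 'YOU'], ['IN', 'HOSPITAL']]
    print(segment("I THINK YOU BE AFRAID".split()))
    # prints [['I', 'THINK'], ['YOU', 'BE', 'AFRAID']]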

MATCHING INDIVIDUAL SEGMENTS

     Conjunctions serve only as markers for the segmenter; they are
dropped after segmentation.
     Negations are handled by extracting the "NOT" from the segment
and assigning a value to a global variable which indicates that the
expression is negative in form.  When a pattern is finally matched,
this variable is consulted.  Some patterns have a pointer to a
pattern of opposite meaning if a "NOT" could reverse their meanings.
If this pointer is present and a "NOT" was found, then the pattern
matched is replaced by its opposite, e.g. ( I NOT TRUST YOU ) is
replaced by the pattern ( I MISTRUST YOU ).  We have not yet
observed the troublesome case of "he gave me not one but two
messages".  (There is no need to scratch where it doesn't itch.)
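     A sketch of this negation handling, with a hypothetical
opposite-pattern table:

    # Extract "NOT", then consult the opposite-pattern pointer, if
    # any, once the remaining segment has matched.
    OPPOSITES = {("I", "TRUST", "YOU"): ("I", "MISTRUST", "YOU")}
    PATTERNS = {("I", "TRUST", "YOU"), ("I", "MISTRUST", "YOU")}

    def match_negation(segment):
        negative = "NOT" in segment
        seg = tuple(w for w in segment if w != "NOT")
        if seg not in PATTERNS:
            return None
        if negative and seg in OPPOSITES:     # flip only when an
            return OPPOSITES[seg]             # opposite is on file
        return seg

    print(match_negation(["I", "NOT", "TRUST", "YOU"]))
    # prints ('I', 'MISTRUST', 'YOU')
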
     Substitutions are also made in certain cases.  Some segments
contain pronouns which could stand for a number of different things
of importance to PARRY2.  As we mentioned in the introduction, there
is one case in which the response functions of memory aid in
language recognition.  One function of memory is to keep track of
the context in order to give pronouns and other anaphora a correct
interpretation.  For example, the segment:
     ( DO YOU AVOID THEM )
could refer to the Mafia, or racetracks, or other patients,
depending on the context.  When such a segment is encountered, the
pronoun is replaced by its current anaphoric value as determined by
the response functions, and a more specific segment such as:
     ( DO YOU AVOID MAFIA )
is looked up.
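     In sketch form (the current topic would be supplied by the
RESPOND module's context mechanism; the pronoun list here is merely
illustrative):

    # Replace a pronoun with its current anaphoric value before
    # the segment is looked up.
    PRONOUNS = {"THEM", "THEY", "IT", "HE", "SHE"}

    def substitute_anaphora(segment, current_topic):
        return [current_topic if w in PRONOUNS else w for w in segment]

    print(substitute_anaphora(["DO", "YOU", "AVOID", "THEM"], "MAFIA"))
    # prints ['DO', 'YOU', 'AVOID', 'MAFIA']
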
     There are other utterances, such as "Why did you do that?" or
just "Why?" (which might be regarded as a massive ellipsis), which
clearly refer back to previous utterances.  These utterances match
very general patterns which identify the type of question without
indicating the exact topic.  The response function which responds to
"Why?" consults the context to produce an appropriate answer.
     The algorithm now attempts to match the segments with stored
simple patterns, which currently number about 1700.  First a
complete and perfect match is sought.  When a match is found, the
stored pattern name has a pointer to the name of a response function
in memory which decides what to do further.  If a match is not
found, further transformations of the segment are carried out and a
"fuzzy" match is tried.
     For fuzzy matching at this stage, we adopted the heuristic rule
of dropping elements in the segment one at a time and attempting a
match each time.  This allows the algorithm to ignore familiar words
in unfamiliar contexts.  For example, "well" is important in "Are
you well?" but meaningless in "Well, are you?".
     Deleting one element at a time results in, for example, the
pattern:
     ( WHAT BE YOU MAIN PROBLEM )
becoming successively:
     (a) ( BE YOU MAIN PROBLEM )
     (b) ( WHAT YOU MAIN PROBLEM )
     (c) ( WHAT BE MAIN PROBLEM )
     (d) ( WHAT BE YOU PROBLEM )
     (e) ( WHAT BE YOU MAIN )
Since the stored pattern in this case matches (d), (e) would never
be constructed.  We found it unwise to delete more than one element,
since our segmentation method usually yields segments containing a
small number (one to four) of words.
     Dropping one element at a time provides a probability threshold
for fuzzy matching which is a function of the length of the segment.
If a segment consists of five elements, four of the five must be
present in a particular order (with the fifth element missing in any
position) for a match to occur.  If a segment contains four
elements, three must match - and so forth.
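     In sketch form, the drop-one-element rule (the stored set here
holds only the pattern matching (d) above):

    # Fuzzy matching: try a perfect match, then drop one element at a
    # time; the first deletion that hits a stored pattern wins, so
    # later variants (like (e) above) are never constructed.
    PATTERNS = {("WHAT", "BE", "YOU", "PROBLEM")}

    def fuzzy_match(segment):
        seg = tuple(segment)
        if seg in PATTERNS:
            return seg
        for i in range(len(seg)):
            candidate = seg[:i] + seg[i + 1:]
            if candidate in PATTERNS:
                return candidate
        return None

    print(fuzzy_match(["WHAT", "BE", "YOU", "MAIN", "PROBLEM"]))
    # prints ('WHAT', 'BE', 'YOU', 'PROBLEM'), i.e. variant (d)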

COMPOUND-PATTERN MATCH

     When more than one simple pattern is detected in the input, a
second match is attempted against about 500 stored compound
patterns.  Certain simple patterns, such as ( HELLO ) and
( I THINK ), are first dropped because they are considered
meaningless.  If a complete match is not found, then simple patterns
are dropped from the compound pattern one at a time.  This allows
the input:
     (( HOW DO YOU COME ) ( TO BE ) ( IN HOSPITAL ))
to match the stored pattern:
     (( HOW DO YOU COME ) ( IN HOSPITAL ))
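     The same drop-one heuristic applies at the compound level, as
this sketch shows (tables again hypothetical):

    # Compound matching: discard meaningless simple patterns, seek a
    # perfect match, then drop one simple pattern at a time.
    MEANINGLESS = {("HELLO",), ("I", "THINK")}
    COMPOUNDS = {(("HOW", "DO", "YOU", "COME"),
                  ("IN", "HOSPITAL")): "ASK-WHY-HOSPITAL"}

    def match_compound(simple_patterns):
        segs = tuple(s for s in simple_patterns if s not in MEANINGLESS)
        if segs in COMPOUNDS:
            return COMPOUNDS[segs]
        for i in range(len(segs)):
            candidate = segs[:i] + segs[i + 1:]
            if candidate in COMPOUNDS:
                return COMPOUNDS[candidate]
        return None                           # the default condition

    print(match_compound([("HOW", "DO", "YOU", "COME"),
                          ("TO", "BE"), ("IN", "HOSPITAL")]))
    # prints ASK-WHY-HOSPITAL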

     If no match can be found at this point, the algorithm has
arrived at a default condition and the appropriate response
functions decide what to do.  For example, in a default condition
the model may assume control of the interview, asking the
interviewer a question, continuing with the topic under discussion,
or introducing a new topic.
     An annotated example of a diagnostic psychiatric interview is
presented in the appendix.


ADVANTAGES AND LIMITATIONS

     As mentioned, one of the main advantages of a pattern-matching
strategy is that it can ignore as irrelevant both some of what it
recognizes and everything it does not recognize at all.  There are
several million words in English, each possessing from one to over a
hundred senses.  To construct a machine-usable word dictionary of
this magnitude is out of the question at this time.  Recognition of
natural language input such as that described above allows real-time
interaction in a dialogue, since it avoids becoming ensnarled in
combinatorial disambiguations and long chains of inferencing which
would slow a dialogue algorithm down to impracticality, if it could
function at all.  The price paid for pattern-matching is that
sometimes, though rarely, ambiguities slip through.  Eventually a
rewrite interpreter must be added to the pattern-matching rules to
deal with ambiguities (Enea and Colby, 1973).
     A drawback of PARRY1 was that it reacted to the first pattern
it found in the input rather than characterizing the input as fully
as possible and then deciding what to do based on a number of tests.
Another practical difficulty with PARRY1, from a programmer's
viewpoint, was that elements of the patterns were strung out in
various procedures throughout the algorithm.  It was often a
considerable chore for the programmer to determine whether a given
pattern was present and precisely where it was.  In the method
described above, the patterns are all collected in one part of the
data base, where they can easily be examined.
     Concentrating all the patterns in the data base gives PARRY2 a
limited "learning" ability.  When an input fails to match any stored
pattern, or matches an incorrect one as judged by a human operator,
a pattern which matches the input can be put into the data base
automatically.  If the new pattern has the same meaning as a
previously stored pattern, the human operator must provide the name
of the appropriate response function.  If he does not remember the
name, he may try to rephrase the input in a form recognizable to
PARRY2, which will then name the response function associated with
the rephrasing.  These mechanisms are not "learning" in the commonly
used sense, but they do allow a person to transfer his knowledge
into PARRY2's data base with very little effort.
     Informal observation thus far shows PARRY2's linguistic
recognition abilities to be quite superior to PARRY1's.  A more
systematic and quantitative evaluation of performance will be
carried out in the near future.


REFERENCES

Colby, K.M., Weber, S., and Hilf, F.D. (1971).  Artificial paranoia.
     ARTIFICIAL INTELLIGENCE, 2, 1-25.

Enea, H. and Colby, K.M. (1973).  Idiolectic language-analysis for
     understanding doctor-patient dialogues.  THIRD INTERNATIONAL
     JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 278-284.

Wilks, Y. (1973).  An artificial intelligence approach to machine
     translation.  In COMPUTER MODELS OF THOUGHT AND LANGUAGE,
     Schank, R. and Colby, K.M., Eds., W.H. Freeman, San Francisco.